ABSTRACT

Rudman and colleagues (1980) deplored the paucity of descriptive information relative to teachers' test use patterns. The present study addresses the abundant prescriptive, and lack of descriptive, information concerning teacher testing. A mailed survey procedure gathered testing practice information from elementary and secondary South Dakota school teachers (n=336) regarding: (1) testing context, (2) test construction, (3) test administration, (4) test analyses, and (5) test results. The survey indicated that teachers use a variety of testing techniques, but only teacher-made objective tests play a major evaluative role across all grade levels and curricular areas. There appear to be three important factors which influence teacher practice: time, expertise, and tools available for teachers' use. Nearly 20 percent of in-class time is devoted to test-related activities. This substantial time investment is a strong argument for skill in the practice of testing; however, most teachers have limited preparation in the area. Improved practices require changing the habits of teachers and educating them to overcome their lack of knowledge of sophisticated tools (e.g., calculators, microcomputers). Perhaps the most clear need is for a return to development of measurement techniques that will be appropriately used in the classroom.

The Practice of Testing in Elementary and Secondary Schools

Gullickson, Arlen R.

Educational measurement issues which reach us in overt ways, e.g., through the press, typically deal with standardized measures of aptitude and achievement. Yet the most pervasive use of measurement occurs in the context of normal classroom routine. Such measurement, through formal and informal assessment processes, forms an important basis of communication among teacher, student, and parents. This communication tends to be personal, not public; low profile, i.e., not involving or engendering public discussion; and controlled by the teacher. Because the communication has these characteristics, its measurement basis is rarely subject to close scrutiny.

What are the measurement practices of teachers? More specifically, what is the context in which tests are given; how are tests constructed, administered, analyzed, and reported? These are questions pertinent to the improvement of testing practices; questions which teachers might ask themselves in self-reflection; and questions which measurement specialists must address in helping teachers to use tests effectively.

Measurement specialists (cf. Hopkins and Stanley, 1981) view evaluation processes as interacting with educational objectives and learning experiences, which together comprise the educational process. Whether evaluation processes, in particular tests, actually do function in this manner is open to question. Rudman and his colleagues (IRT, 1980, p. 20), after a review of literature covering nearly 60 years, deplored the paucity of descriptive information relative to teachers' test use patterns. Their review makes it clear that while prescriptive information is abundant, the lack of descriptive data makes it impossible to determine whether the prescriptions fit, and are appropriate to, practice. This study was initiated to address that prescription/practice gap and focused on the teacher testing practice questions posed earlier.

A mailed survey procedure was used to gather the information from teachers who were sampled from the South Dakota directory of teachers in elementary and secondary schools. In all, 75% (336 of a total of 450 teachers), stratified by grade level (grades 3, 7, and 10) and curricular area (science, social science, and language arts), responded to the questionnaire. In each case the cover letter asked the teacher to respond relative to personal testing practices.
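
As a quick arithmetic check on the response rate, the following Python sketch recomputes it from the two counts given in the text (336 respondents out of 450 teachers sampled). The function name is illustrative only.

```python
def response_rate(responded: int, sampled: int) -> float:
    """Return the survey response rate as a percentage."""
    if sampled <= 0:
        raise ValueError("sampled must be positive")
    return 100.0 * responded / sampled


rate = response_rate(336, 450)
print(f"{rate:.1f}%")  # about 74.7%, which the text rounds to 75%
```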

The teachers who responded to the questionnaire appear to be typical of teachers in general. That is, they are college graduates holding at least a bachelor's degree, with a quarter (24%) having a master's or higher level degree. They are experienced teachers, the majority (50%) having taught 10 years or more. Ninety-five percent teach at least three classes a day and the majority have at least three course preparations. The majority have taken only one course (57%) or no course (5%) in educational measurements, but for a large majority (84%) other courses have provided some information about the preparation of tests.

Almost all of these teachers use tests, with 89% of the elementary teachers and 99% of the secondary teachers (junior and senior high school) reporting such use. Not only do they test, but they do so frequently. Virtually all test on a weekly (95%) or at least a biweekly (98%) basis. In this testing process they use a variety of testing techniques, but only teacher-made objective tests play a major evaluative role across all grade levels and curricular areas.

The questionnaire to which teachers responded was built on the premise that test use is cyclical in nature. That is, a test is initiated to meet specified purposes; preparations are made, the test is administered and analyzed, and the results are used in the context of intended purposes. Thus, in responding, teachers first provided contextual information. Then, in the order cited above, they answered items regarding their personal testing practices.

RESULTS

Responses were analyzed by grade and curricular level to identify practices which are related to those two variables. Where significant effects were found, they are reported. In those situations where the dependent variables had interval scale characteristics, and several dependent variables were analyzed together, multivariate analysis of variance techniques were employed (SAS, 1979). Where individual dependent variables were analyzed, if the dependent variable had interval scale characteristics, analysis of variance techniques were used. If only frequency counts were available for the dependent variables, chi-square and contingency table analyses were conducted.
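
The chi-square analysis of frequency counts can be illustrated with a short Python sketch. The cell counts below are hypothetical (the article reports percentages, not cell frequencies), and `chi_square_statistic` is a helper written here for illustration; it is not the SAS procedure the study used.

```python
from typing import List, Tuple


def chi_square_statistic(table: List[List[float]]) -> Tuple[float, int]:
    """Pearson chi-square statistic and degrees of freedom for an r x c table.

    Assumes every row total and column total is positive.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            if expected == 0:
                raise ValueError("every marginal total must be positive")
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, dof


# Hypothetical counts: rows are curricular areas, columns are
# (uses an item type, does not use it), with 100 teachers assumed per area.
counts = [
    [83, 17],  # social science
    [69, 31],  # science
    [55, 45],  # language arts
]
chi2, dof = chi_square_statistic(counts)
print(f"chi-square = {chi2:.2f}, df = {dof}")
```

With 2 degrees of freedom, a statistic above 5.99 is significant at the .05 level, so counts like these would be reported as a significant curricular-area difference.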



The Testing Context

When queried as to the role that several different types of tests had in their evaluation of students, teachers reported teacher-made objective tests as having the greatest role and essay tests as having the second largest role, followed by standardized objective tests and oral quizzes. Of the four, objective tests received much higher ratings than did the other three. Essay tests received high ratings at the secondary level but very low ratings at the elementary level. In general, the role of testing in the classroom increases from the elementary to the secondary level (Note 1). The role of testing also differs significantly but not substantially across curricula.

Testing is a time-consuming activity. For example, in the use of teacher-made tests, some teachers report spending more than nine hours per individual test in the various test-related activities. The typical, i.e., median, teacher reported spending slightly over three hours (190 minutes) on test-related activities. Roughly, this breaks out to 60 minutes for test preparation, 30 minutes for test correction, and 20 minutes for post-test review.

Given this background of teacher experience, the role of testing for teachers, and the amount of time teachers spend in the context of testing, teachers were asked which of several purposes classroom tests were expected to fill. Six separate purposes were identified, and for each the teacher was asked to rate on a four-point basis the extent to which it was a purpose for which they used classroom tests (0 = not a purpose, 1 = minor, up to 3 = major purpose of the test). Of the six, three received mean ratings of approximately 2.6. These were: instructional feedback for student learning (2.64), evaluation of instruction (2.62), and evaluation (grading) of students (2.58). Motivation of student study ranked fourth in ratings (2.23). The remaining two, assessment of the attitudes or interests of students (1.54) and providing opportunity for student input into evaluation of instruction (1.47), received substantially lower ratings.

Two purposes, evaluation or grading of students and the assessment of attitudes and interests, varied by grade level. Teachers placed less emphasis on grading purposes at the elementary level and progressively more emphasis through the senior high. The mean rating at the elementary level was 2.34, with a mean rating of 2.7 at the senior high level.

Mean ratings on the assessment of the attitudes and interests of students moved in just the opposite direction, being highest at the elementary level (1.81) and much lower at the secondary level, i.e., 1.36 for junior high and 1.46 for senior high, respectively. Clearly, teachers perceive tests as serving an instructional purpose, both for feedback to the students and feedback to the teacher, with grading of students maintaining an important role in that feedback.

Test Construction

Teachers were asked about their source of items, the types of items that they constructed, whether or not their tests covered all of the material they teach, and whether or not they reuse their tests in subsequent semesters. They identified two primary sources for items. First, 93% view themselves as a primary source of items, i.e., they write their own items. Second, 60% also report using test items prepared by the publisher of the textbook which they are using. Two other sources, other published test items and test items prepared by other teachers, were identified as primary sources by substantially fewer teachers (23% and [illegible]%, respectively).

In three of these areas, there were grade level or curricular area differences. A slightly lower percentage of elementary teachers write their own items (85%) as opposed to 96% for the secondary teachers. Also, elementary teachers are more prone to use textbook items: 75% vs. 61% and 47% for elementary, junior high, and senior high school, respectively. Third, although items prepared by other teachers were not the primary source for most teachers, junior high teachers were more prone to share items than were other teachers, i.e., 20% for junior high vs. 9% for elementary or senior high teachers.

When asked to identify what types of items they normally construct, most (54%) teachers checked several, i.e., four to five separate item types. The most popular type of item was short answer/completion (92%), followed by matching and multiple-choice (77% and 76%, respectively), followed by true/false (68%), and finally, essay items (58%). The use of multiple-choice and essay items both differed significantly across grade level, with fewer teachers at the elementary level choosing those items than at the secondary level. The use of true/false items differed across curricular area, with more teachers in the social sciences choosing to use true/false items than teachers in either science or language arts areas (83% vs. 69% and 55% for social science, science, and language arts, respectively).

Two other general perspectives of test preparation were provided. One, even though teachers prepare their own tests, they do not perceive the test as adequately evaluating all that they teach. Rather, the average teacher perceives tests to cover approximately 75% of the material taught. Second, once teachers have prepared tests they tend to reuse those tests in the future. Eighty-four percent of the teachers report reuse of their tests, of which 60% report reusing all or major parts of the tests and 25% report reusing selected items. Thus, for most, the preparation of the test does not require totally constructing a new test each time a test is administered.

Testing appears to be a formal, constrained situation in which students expect to be graded. Virtually all teachers (99%) do not allow student interaction during the testing process. A substantial percentage, 26%, do not even allow students to ask questions of the teacher. In addition, students are constrained in their use of support material. Seventy-nine percent of the teachers do not allow students to use their textbook, notes, etc., in completing a test. An exception to this general statement on support materials occurs in the use of calculators. While in general 89% of the teachers do not allow calculators, in the area of science, where calculator use might be most prominent, 40% of senior high school teachers allow use during tests. (It seems likely that this percentage would be substantially higher if teachers of physics and chemistry in grades 11 and 12 were queried.)



Teachers were also asked whether students were to provide answers in the test booklet or on separate answer sheets. While most require students to answer in the test booklet, a substantial minority, 36%, do require the use of a separate answer sheet. This seemed important from two perspectives. First, if the tests were speeded, that is, given within time constraints such that many students could not finish the test, the use of separate answer sheets would be a substantial concern. Second, the use of a separate answer sheet provides opportunity for test booklets to be reused.

Speededness of tests appears not to be a problem, as most, 92%, provide sufficient time such that almost all students have as much time as they need to finish the test. Regarding potential reuse, of those who use separate answer sheets, 17% say it is solely for reuse of the test booklet; 38% say such use is solely for administrative ease in scoring; and 24% say it is for both administrative ease in scoring and reuse of the test booklet. Thus, approximately 20% of all teachers set up the administration of the tests so that scoring of the test is facilitated, and approximately 10% of the teachers set up test administration procedures to facilitate future reuse of the test booklets.

Teachers were asked to respond on a four-point basis (always, usually, sometimes, and never) to several items with regard to their scoring and grading practices. Here teachers report that they do their own scoring and grading of tests, i.e., 95% and 97%, respectively, either always or usually score and grade their own tests. (Junior high teachers report being slightly less likely to score and grade their own tests than are elementary and senior high teachers.)

Typically, teachers assign a test grade rather than providing only a numerical score. In this context, rarely do teachers just assign a pass/fail grade to student tests (mean score of 3.9, where 3.0 equals sometimes and 4.0 equals never). Related to this, most teachers (78%) use a criterion reference scheme for grading tests; only 10% use a curve basis for grading. Here criterion reference was used in the context of the example: 90% or better for an A, 80-90% a B, etc. In addition to scoring and grading tests, 90% report providing written comments to students on at least an occasional basis, with 55% reporting they always or usually do.

A second set of items asked teachers to identify which statistics they used in working with test results. Here [...]. Ninety percent indicate that they provide a total test score. Less than half, 42%, obtain the range of test scores. Few, 10% to 13%, use such information as the mean, median, and standard deviation. A fairly large minority of teachers reported use of item difficulty and reliability information, 31% and 29%, respectively.

Clearly many teachers erred in checking reliability and item difficulty. For example, not only would it be unwise to talk about the reliability of the test without gaining information about the variability (standard deviation) of test scores, but calculation of the reliability requires knowledge of the standard deviation. Also, calculation of item difficulty, i.e., the percent of correct responses for each item, requires substantially more effort than does calculation of the mean, median, or standard deviation. Thus, the high percent of response for item difficulty and reliability suggests that many teachers do not have an adequate understanding of either the terms or how such information is obtained from test results.
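
To make that dependence concrete, here is a minimal Python sketch, assuming dichotomously scored (right/wrong) items. The article does not say which reliability coefficient teachers had in mind; Kuder-Richardson formula 20 (KR-20) is used here only as one common choice, and the response data are hypothetical. The reliability calculation needs the variance of total scores (the squared standard deviation), while item difficulty requires a separate tally for every item.

```python
from statistics import mean, pstdev
from typing import List


def item_difficulties(responses: List[List[int]]) -> List[float]:
    """Proportion of students answering each item correctly (1 = right, 0 = wrong)."""
    n_students = len(responses)
    return [sum(item) / n_students for item in zip(*responses)]


def kr20(responses: List[List[int]]) -> float:
    """Kuder-Richardson formula 20 reliability for right/wrong items."""
    k = len(responses[0])  # number of items
    totals = [sum(student) for student in responses]
    variance = pstdev(totals) ** 2  # population variance of total scores
    if k < 2 or variance == 0:
        raise ValueError("need at least two items and non-zero score variance")
    pq_sum = sum(p * (1 - p) for p in item_difficulties(responses))
    return (k / (k - 1)) * (1 - pq_sum / variance)


# Hypothetical scored responses: 5 students (rows) by 4 items (columns).
scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
totals = [sum(student) for student in scores]
print("mean:", mean(totals))
print("standard deviation:", round(pstdev(totals), 3))
print("item difficulties:", item_difficulties(scores))
print("KR-20:", round(kr20(scores), 3))
```

The mean and standard deviation need only the list of total scores, whereas both item difficulty and KR-20 need the full student-by-item record, which is the extra effort the paragraph above refers to.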

Teachers attempt to return test results to the students in a timely manner; only 6% required more than two days to process tests for return, 83% returned the tests within one day of the test, and 71% indicate that they return the tests the same day.

Teachers were asked to apportion time spent with the class in review of the test into three categories: 1) review of scoring and grading procedures, 2) review of individual test items based on individual student requests, and 3) review of individual items based upon the teacher's perusal of student results. The average teacher indicates that 16% of the time is spent in review of grading procedures, 41% in the review of individual test items based on student requests, and 43% in the review of items and item groups based on teacher perusal of tests. When viewed in the context of a median total time of 20 minutes spent in the class review of tests, this breaks down to approximately 9 minutes spent on items chosen by the teacher, 8 minutes on items chosen by the students, and 3 minutes spent in the review of grading procedures.
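
The conversion from reported shares to minutes is simple arithmetic; the short Python sketch below reproduces it using the figures in the text (a 20-minute median review and shares of 16%, 41%, and 43%). The variable names are illustrative.

```python
REVIEW_MINUTES = 20  # median in-class review time reported in the text

# Share of review time per category, as reported by the average teacher.
shares = {
    "grading procedures": 0.16,
    "items requested by students": 0.41,
    "items chosen by the teacher": 0.43,
}

minutes = {category: share * REVIEW_MINUTES for category, share in shares.items()}
for category, value in minutes.items():
    print(f"{category}: {value:.1f} min")
```

Rounded to whole minutes, these values give the 3, 8, and 9 minutes quoted above.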

Finally, teachers were asked whether students were allowed to keep their tests, whether tests were returned to the teacher and thus not available for individual student review, or whether tests were retained by the teacher and students were allowed to review the tests under supervision. In each case the teacher was to respond on an always, usually, sometimes, or never basis. Here, as might be expected, there were significant differences across grade levels. At the elementary level, the average teacher "usually" let students keep the tests. At the secondary level, teachers only "sometimes" let the students keep the tests. Commensurate with those findings, secondary level teachers more frequently retain the tests but allow students to review the tests under supervision.

A significant proportion of class time and teacher time is devoted to the activity of testing. If one estimates that an overall average of 45 minutes per day, five days a week, is given to each class, and if it is also estimated that a teacher-made objective test is administered every other week, then nearly 20% of in-class time is devoted to test-related activities. Probably an even higher percentage of total teacher work time is given to test activities. This substantial time investment is a strong argument for requiring teachers skilled in the practice of testing, and for developing efficient testing techniques.
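
The estimate can be sketched as a small calculation. The article gives the class schedule (45 minutes a day, five days a week) and the test interval (every other week), but it does not state how many in-class minutes it attributes to each test. The 90 minutes used below is therefore an assumption, chosen only because it is the value at which the share reaches 20%; the function name is also illustrative.

```python
def in_class_testing_share(minutes_per_class_day: float,
                           days_per_week: int,
                           weeks_between_tests: int,
                           in_class_test_minutes: float) -> float:
    """Fraction of available in-class time taken up by one test cycle."""
    available = minutes_per_class_day * days_per_week * weeks_between_tests
    return in_class_test_minutes / available


# Schedule figures come from the text; the 90 in-class minutes per test
# is an assumed value, not one the article reports.
share = in_class_testing_share(45, 5, 2, 90)
print(f"{share:.0%}")
```

Under these inputs there are 450 class minutes in each two-week cycle, so any in-class testing time somewhat below 90 minutes per test would correspond to the "nearly 20%" the text describes.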

But, as the results show, most teachers have very limited preparation in the area of testing. In the state of South Dakota, for example, collegiate programs routinely provide two semester credit hours of educational measurements to meet certification requirements. Any additional test information is provided at the discretion of individual faculty in methods courses. Other results suggest this limited educational experience is inadequate.

There are at least three tentative indicators that whatever is taught in pre-service courses does not spill over into sound testing practices by teachers. First, in the preparation of tests, short answer and matching items are the most popular items of choice. Both types tend to be limited to lower cognitive level, i.e., knowledge level, assessment (Hopkins and Stanley, 1981). Thus tests probably assess only lower cognitive level understandings.



Second, while the large majority of teachers reuse items, few teachers take the time or make the effort to systematically improve their items. This is suggested by the minimal amount of time given to test analysis (barely enough to score and grade tests) and by the minimal use of test statistics. As a direct result, test item improvement must be done on a merely ad hoc and subjective basis.



Third, teachers appear to misuse criterion referenced tests. On the surface, teachers' advocacy of criterion referenced testing would indicate evidence of a firm criterion referenced testing foundation. However, even if teachers clearly define their test domain (a topic not addressed in this survey), they clearly do not address quality of items in a manner which would insure their items function as desired. Most reuse their items but without careful item analysis. Thus, criteria established by teachers are both artificial and subjective. For without knowing how items function, it is not possible to accurately set criterion levels for student performance. Regardless of the domain being tested, a test may be prepared which is very difficult or very easy. Also, the cognitive level of the test may be shifted so that only knowledge level items are addressed or higher cognitive level items are addressed as well. Results of this study suggest that neither test difficulty nor the cognitive level of items has been adequately addressed by teachers. Thus criterion referenced testing is simply a word and not an accomplished fact.

Potentially, the consequences of these concerns are substantial. If tests are oriented toward lower cognitive levels and students are graded on their attainment of such knowledge, students must be motivated to focus on lower cognitive level learning. Also, because teachers grade on a criterion referenced basis but without a priori knowledge of how their tests will function, their expectations of students and their rewarding of students (grades, praise, etc.) must vary as a function of test quality. Such testing effects seem undesirable.

There appear to be three important factors which influence teacher practice. They are time, expertise, and tools available for the teacher's use. Given the already substantial amount of time that teachers apply to the testing practice, it seems unlikely that teachers can substantially add to the amount of time presently being used. Thus, teachers must either reorient their time (for example, by using less time in test preparation and more time in test analysis), or they must find more efficient methods for handling the process of testing.

Quite likely, if the testing routines of teachers were studied in depth, there would be numerous ways of simplifying the testing practice process and improving its efficiency. These new techniques could then be brought to teachers through in-service and pre-service instruction to improve teacher knowledge and effectiveness in the area of testing. Such efforts alone would not be sufficient. There remains a substantial proportion of the teachers who are either uninformed or misinformed about basic testing concepts, e.g., reliability. Such concepts need to be re-presented to teachers in ways which are compatible with their testing situation so that conceptual understanding grows rather than deteriorates over time.

The use of tools available to the teacher is the third area that seems very appropriate to pursue. While at first glance it would appear that the tools available to those in testing have remained constant over the past years, in fact a number of substantial changes have been made. For example, the advent of the photocopy machine essentially eliminates the need to retype an item each time it is used. The hand-held calculator makes computation of means, standard deviations, and even reliabilities a relatively straightforward and short process. Also, the microcomputer is sure to facilitate the development of items and item analyses, as well as the individual testing of students.

Personal experience suggests that it is a rare teacher who stores items in a manner which allows test preparation without the need for individual item typing. Also, most teachers are relatively unfamiliar with the more sophisticated calculators which can do means, standard deviations, and reliabilities in a straightforward manner, and they are unfamiliar with the possibilities which exist in microcomputers. Thus, improved practices require changing the habits of teachers and educating them to overcome their lack of knowledge and fear of the more sophisticated tools. Even then teachers may need to be persuaded that the payoff from improved tests is commensurate with the added effort.

If teachers are to improve their testing habits, and it seems important that they do, they will need assistance. This entails practical help in making them more efficient in their daily testing habits and new ideas and expertise in testing. Perhaps what is most clear is the need to return to the basics of measurement. That is, a return to development of measurement techniques that will be appropriately used in the classroom.